LASR 2013 — Statistical Models and Methods for non-Euclidean Data with Current Scientific Applications
Abstract
For inference on non-Euclidean data, ideally the data space is at least locally approximated by a Euclidean space; for a manifold by a chart, say. If suitable data descriptors lie asymptotically in such a common Euclidean neighborhood, one would expect a normal central limit theorem to hold. Already for locally flat spaces like the circle there are exceptions causing smeariness. On stratified spaces, singularities may also prevent asymptotic normality. While for shape spaces, due to relatively increased curvature, singularity sets are repulsive, for spaces with sudden drops in curvature, phenomena of stickiness arise.

1 Means on Stratified Spaces: Manifolds and Non-Manifolds

At least since the middle of the last century (cf. Fisher (1953)), statisticians have investigated non-Euclidean data such as circular data (e.g. wind directions), directional data, orientational data and shape data, to name only a few. On "nice" spaces such as the circle, spheres and even planar shape spaces (which are complex projective spaces), parametric statistical theory has provided an ample collection of distributional models, thus allowing for maximum likelihood estimation. In the sense that the parameters involved range over a subset of a Euclidean space, this methodology for non-Euclidean data can still be called Euclidean.

A different tack is pursued by nonparametric statistical theory when it comes to estimating a non-Euclidean quantity such as a typical direction, orientation, shape, et cetera via Fréchet means (cf. Fréchet (1948)). Notably, this methodology also carries over when non-Euclidean spaces are no longer "nice", e.g. no longer symmetric spaces or even no longer manifolds, as is the case for phylogenetic tree spaces (cf. Billera et al. (2001)) or Kendall's shape spaces of three- and higher-dimensional objects (cf. Kendall et al. (1999)). In principle, there are three types of Fréchet means: extrinsic, residual and intrinsic (for an overview cf. Huckemann (2011)).
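As an illustration of the extrinsic and intrinsic Fréchet means named above, the following is a minimal sketch for the simplest case, the unit circle with arc-length distance. It is not taken from the paper; the function names (`geodesic_distance`, `frechet_function`, `intrinsic_mean`, `extrinsic_mean`) are hypothetical. The intrinsic mean is found by using the fact that, between antipodes of the data points, the Fréchet function is quadratic, so every local minimizer has the form (arithmetic mean of the angles) + 2πk/n for some integer k.

```python
import math

TWO_PI = 2 * math.pi

def geodesic_distance(a, b):
    """Arc-length distance between two angles on the unit circle, in [0, pi]."""
    diff = (a - b) % TWO_PI
    return min(diff, TWO_PI - diff)

def frechet_function(mu, angles):
    """Mean squared geodesic distance from mu to the sample."""
    return sum(geodesic_distance(a, mu) ** 2 for a in angles) / len(angles)

def intrinsic_mean(angles):
    """Intrinsic Frechet mean: a global minimizer of the Frechet function.

    Candidates are the Euclidean mean of the wrapped angles shifted by
    2*pi*k/n; the one with the smallest Frechet value is returned.
    If several candidates tie, the first one found is returned.
    """
    n = len(angles)
    wrapped = [a % TWO_PI for a in angles]
    base = sum(wrapped) / n
    candidates = [(base + TWO_PI * k / n) % TWO_PI for k in range(n)]
    return min(candidates, key=lambda mu: frechet_function(mu, wrapped))

def extrinsic_mean(angles):
    """Extrinsic mean: average the embedded points in the plane, then
    project back to the circle (undefined if the average is the origin)."""
    c = sum(math.cos(a) for a in angles)
    s = sum(math.sin(a) for a in angles)
    if math.hypot(c, s) < 1e-12:
        raise ValueError("planar average is at the origin; no unique projection")
    return math.atan2(s, c) % TWO_PI

if __name__ == "__main__":
    sample = [0.0, 0.0, 2.0]  # illustrative angles in radians, not real data
    print("intrinsic:", intrinsic_mean(sample))
    print("extrinsic:", extrinsic_mean(sample))
```

For concentrated samples the two means nearly agree; for spread-out samples, such as the illustrative one above, they generally differ, since the extrinsic mean depends on the embedding in the plane while the intrinsic mean uses only the geodesic distance.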
Extrinsic and residual means require an explicitly tractable embedding in a Euclidean space, which is often unavailable. By their very nature, their asymptotics are assessed by a mixture of Euclidean and non-Euclidean methods. Intrinsic means, however, are available with every concept of a (geodesic) distance, and the corresponding statistical methodology can truly be called non-Euclidean. In view of asymptotics of intrinsic Fréchet means, we focus here on (a) the influence of mass near cut loci and (b) the influence of singularities and discontinuities of the curvature.

In the following, let Q be a stratified space (e.g. Pflaum (2001)) with geodesic distance d : Q × Q → [0, ∞) and top manifold stratum Q∗ that is open and dense in Q. We assume that Q∗ carries a Riemannian structure for which d|Q∗ is the geodesic distance. By exp_q we denote the Riemann exponential centered at q ∈ Q∗, which is a diffeomorphism from an open set U_q

Supported by DFG HU 1575/2-1 and by the Felix Bernstein Institute for Mathematical Statistics in the Biosciences and the Niedersachsen Vorab of the Volkswagen Foundation.
Similar resources
A Statistical Study of two Diffusion Processes on Torus and Their Applications
Diffusion Processes such as Brownian motions and Ornstein-Uhlenbeck processes are the classes of stochastic processes that have been investigated by researchers in various disciplines including biological sciences. It is usually assumed that the outcomes of these processes are laid on the Euclidean spaces. However, some data in physical, chemical and biological phenomena indicate that they cann...
Regression Modeling for Spherical Data via Non-parametric and Least Square Methods
Introduction Statistical analysis of the data on the Earth's surface was a favorite subject among many researchers. Such data can be related to animal's migration from a region to another position. Then, statistical modeling of their paths helps biological researchers to predict their movements and estimate the areas that are most likely to constitute the presence of the animals. From a geome...
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
Spatial Analysis in curved spaces with Non-Euclidean Geometry
The ultimate goal of spatial information, both as part of technology and as science, is to answer questions and issues related to space, place, and location. Therefore, geometry is widely used for description, storage, and analysis. Undoubtedly, one of the most essential features of spatial information is geometric features, and one of the most obvious types of analysis is the geometric type an...
Flexibility of Variations in Radial and Non-Radial Data Envelopment Analysis Models
One of the major problems in Data Envelopment Analysis (DEA) is to determine the projection of inefficient Decision Making Units (DMUs) into the efficient frontier. In conventional DEA models, inputs and outputs of inefficient DMUs alter arbitrarily for reaching to the efficient frontier. Nevertheless, sometimes the ability of DMUs is defined and restricted. Moreover, there are situations in th...
Publication date: 2013